Closed Bug 1455316 Opened 7 years ago Closed 7 years ago

sometimes when test-verify finds a failure, all future tests are marked as failing also

Categories

(Testing :: General, enhancement, P1)

enhancement

Tracking

(firefox61 fixed)

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: jmaher, Assigned: jmaher)

References

(Blocks 1 open bug)

Details

(Whiteboard: [PI:April])

Attachments

(1 file, 4 obsolete files)

test-verify solves many problems, but can be misleading when the first of a few tests fail as we parse the entire output log when summarizing each test.
Blocks: test-verify
It depends -- failures detected by mozharness log parsing are persistent.
Summary: when test-verify finds a failure, all future tests are marked as failing also → sometimes when test-verify finds a failure, all future tests are marked as failing also
I don't feel this is the best approach in the world, but I am not coming up with a better way to solve this problem- would like your opinion here.
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8970446 - Flags: review?(gbrown)
Comment on attachment 8970446 [details] [diff] [review] keep track of existing failures when parsing the file multiple times Review of attachment 8970446 [details] [diff] [review]: ----------------------------------------------------------------- There are some implementation details missing here, but I don't hate the strategy and don't have a better suggestion -- it might just work! ::: testing/mozharness/mozharness/mozilla/structuredlog.py @@ +104,5 @@ > self.log(log_data, level=level) > self.update_levels(tbpl_level, level) > > + def _subtract_tuples(self, old, new): > + items = old.keys() + list(set(new.keys()) - set(old.keys())) Couldn't you simply have: items = set(old.keys() + new.keys()) ...or am I misunderstanding? @@ +126,4 @@ > success_codes = success_codes or [0] > summary = self.handler.summarize() > > + if previous_summary: Add a nice explanatory comment here. @@ +149,5 @@ > + # Add the current values to the new values > +# joined_summary = RunSummary(self._add_tuples(previous_summary.unexpected_statuses, summary.unexpected_statuses), > +# self._add_tuples(previous_summary.expected_statuses, summary.expected_statuses), > +# self._add_tuples(previous_summary.log_level_counts, summary.log_level_counts), > +# self._add_tuples(previous_summary.action_counts, summary.action_counts)) What's happening here with the commented-out code? @@ +207,5 @@ > # expected info from a structured log (fail count from the prior implementation > # includes unexpected passes from "todo" assertions). > + try: > + summary = self.summary > + except: Catch a specific exception instead of using except:. ::: testing/mozharness/scripts/desktop_unittest.py @@ +921,5 @@ > success_codes = [0, 1] > > + tbpl_status, log_level, summary = parser.evaluate_parser(return_code, > + success_codes, > + summary) What if parser is an DesktopUnittestOutputParser? I think you'll need to do the same for Android: https://dxr.mozilla.org/mozilla-central/rev/a83a4ef50f6ca754ec451320dfefbffa707bad1a/testing/mozharness/scripts/android_emulator_unittest.py#823 and maybe? web-platform-tests: https://dxr.mozilla.org/mozilla-central/rev/a83a4ef50f6ca754ec451320dfefbffa707bad1a/testing/mozharness/scripts/web_platform_tests.py#390 ...not sure if those are StructuredOutputParsers... :)
Attachment #8970446 - Flags: review?(gbrown)
thanks for the review, I had overlooked desktopunittestoutputparser, android, and wpt- I am working on those. the commented out code was accidentally left in, I am cleaning this up and testing it more thoroughly.
Priority: -- → P1
thanks for the previous review, I have fixed this and have adequate coverage on try for various test types and platforms: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cf844b980b22a0cef0823a432db43df443470291
Attachment #8970446 - Attachment is obsolete: true
Attachment #8972293 - Flags: review?(gbrown)
Comment on attachment 8972293 [details] [diff] [review] keep track of existing failures when parsing the file multiple times Review of attachment 8972293 [details] [diff] [review]: ----------------------------------------------------------------- ::: testing/mozharness/mozharness/mozilla/structuredlog.py @@ +112,5 @@ > + if merged[item] <= 0: > + del merged[item] > + return merged > + > + def _add_tuples(self, old, new): Is _add_tuples() unused? ::: testing/mozharness/mozharness/mozilla/testing/unittest.py @@ +15,5 @@ > > +from collections import ( > + defaultdict, > + namedtuple, > +) These imports seem unused. ::: testing/mozharness/scripts/marionette.py @@ +353,5 @@ > output_timeout=1000, > output_parser=marionette_parser, > env=env) > level = INFO > + tbpl_status, log_level, summary = marionette_parser.evaluate_parser( summary is unused (not fed back into evaluate_parser) -- intentional?
Attachment #8972293 - Flags: review?(gbrown) → review+
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/5cdb10268f47 sometimes when test-verify finds a failure, all future tests are marked as failing also. r=gbrown
this was backed out for failing en-us jobs (and it also failed some selftests which don't get run automatically on try), I did a -u all on try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2a15e1e760550e26a03e412ae215b1cf900d4ae8 and then I fixed the other self tests for python harness code. I see all green and thought...did I break failures- I did examine 30 different logs and there are no real failures there, I am doing more retriggers on bc as that is the higher failure rate.
Attachment #8972293 - Attachment is obsolete: true
Flags: needinfo?(jmaher)
Attachment #8972405 - Flags: review?(gbrown)
Comment on attachment 8972405 [details] [diff] [review] keep track of existing failures when parsing the file multiple times Review of attachment 8972405 [details] [diff] [review]: ----------------------------------------------------------------- ::: testing/mozharness/mozharness/mozilla/structuredlog.py @@ +112,5 @@ > + if merged[item] <= 0: > + del merged[item] > + return merged > + > + def _add_tuples(self, old, new): Hmm? Looks like recommended changes from the first review have regressed. Wrong patch?
Attachment #8972405 - Flags: review?(gbrown)
Attachment #8972630 - Flags: review?(gbrown) → review+
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/afbec9a6802c sometimes when test-verify finds a failure, all future tests are marked as failing also. r=gbrown
and I ran all possible jobs I could: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d4e096bde0f8915083fc0fa51a2a7b593576be3d the only exception would be if there is a specific job which only runs on a config/platform that we haven't seen yet. I suspect I would have more data from the 2 times this was backed out to help hilight any other jobs.
Attachment #8972630 - Attachment is obsolete: true
Flags: needinfo?(jmaher)
Attachment #8972816 - Flags: review?(gbrown)
Comment on attachment 8972816 [details] [diff] [review] keep track of existing failures when parsing the file multiple times Review of attachment 8972816 [details] [diff] [review]: ----------------------------------------------------------------- It's got to work this time, right?
Attachment #8972816 - Flags: review?(gbrown) → review+
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f124a57ef00a sometimes when test-verify finds a failure, all future tests are marked as failing also. r=gbrown
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: